Using DTW to compare sounds

Using librosa, for instance, you can easily extract the MFCCs (mel-frequency cepstral coefficients) of a sound.

Compute the MFCCs of two sounds


In [1]:
import librosa

y1, sr1 = librosa.load('../../Downloads/tmp/sounds/10.wav')
y2, sr2 = librosa.load('../../Downloads/tmp/sounds/78.wav')

In [2]:
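%pylab inline
import librosa.display  # specshow lives in a submodule that may need an explicit import

# Each MFCC matrix has shape (n_mfcc, n_frames)
subplot(1, 2, 1)
mfcc1 = librosa.feature.mfcc(y=y1, sr=sr1)
librosa.display.specshow(mfcc1)

subplot(1, 2, 2)
mfcc2 = librosa.feature.mfcc(y=y2, sr=sr2)
librosa.display.specshow(mfcc2)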


Populating the interactive namespace from numpy and matplotlib
Out[2]:
<matplotlib.image.AxesImage at 0x11276bd10>

Compare them using DTW


In [3]:
from dtw import dtw

In [4]:
# Transpose so that each row is one frame (time step) of MFCCs
dist, cost, path = dtw(mfcc1.T, mfcc2.T)
print('Normalized distance between the two sounds:', dist)


Normalized distance between the two sounds: 192.489808008
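
For intuition about what this number represents, here is a minimal sketch of the DTW recurrence in plain Python. It is not the `dtw` package's code: the function names `l1` and `dtw_distance`, the Manhattan frame distance, and the normalization by the sum of the two sequence lengths are all assumptions made for illustration.

```python
def l1(a, b):
    """Manhattan distance between two feature vectors (assumed frame distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dtw_distance(x, y, dist=l1):
    """Return (normalized distance, accumulated cost matrix) for sequences x and y.

    x and y are sequences of frames (one feature vector per time step),
    which is why MFCC matrices are transposed before being compared.
    """
    n, m = len(x), len(y)
    inf = float('inf')
    # acc[i][j] = cost of the best alignment of x[:i] with y[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = dist(x[i - 1], y[j - 1]) + step
    # Drop the padding row and column
    cost = [row[1:] for row in acc[1:]]
    # Assumed normalization: divide by the sum of the two sequence lengths
    return cost[-1][-1] / (n + m), cost


# Toy example: b is a with one frame repeated, so the alignment is perfect
a = [[0.0], [1.0], [2.0], [3.0]]
b = [[0.0], [1.0], [1.0], [2.0], [3.0]]
d, cost = dtw_distance(a, b)
print(d)
```

Because time warping absorbs the repeated frame, the toy sequences end up at distance zero even though their lengths differ. This is what makes DTW suitable for comparing sounds of different durations.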

In [5]:
imshow(cost.T, origin='lower', cmap=cm.gray, interpolation='nearest')
plot(path[0], path[1], 'w')
xlim((-0.5, cost.shape[0]-0.5))
ylim((-0.5, cost.shape[1]-0.5))


Out[5]:
(-0.5, 37.5)
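
The white line in the plot is the warping path through the accumulated cost matrix. As a rough sketch of how such a path can be recovered, the code below walks back from the last cell to the first, always moving to the cheapest neighbouring predecessor. The function name `traceback` and the tie-breaking rule are assumptions, not the `dtw` package's implementation.

```python
def traceback(cost):
    """Recover a warping path from an accumulated cost matrix.

    Returns two lists of indices (rows, columns), ordered from start to end.
    """
    i, j = len(cost) - 1, len(cost[0]) - 1
    rows, cols = [i], [j]
    while i > 0 or j > 0:
        # Candidate predecessors: diagonal, up, left (only those inside the matrix)
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        # Move to the cheapest predecessor (ties resolved by tuple ordering)
        _, i, j = min(candidates)
        rows.append(i)
        cols.append(j)
    return rows[::-1], cols[::-1]


# Accumulated costs for the toy 1-D sequences [0, 1, 3] and [0, 1, 1, 3],
# using the absolute difference as frame distance (illustrative values)
cost = [
    [0.0, 1.0, 2.0, 5.0],
    [1.0, 0.0, 0.0, 2.0],
    [4.0, 2.0, 2.0, 0.0],
]
rows, cols = traceback(cost)
print(rows, cols)
```

The two index lists play the same role as `path[0]` and `path[1]` in the plotting cell: one index per sequence for every step of the alignment.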
